[BUGFIX] Raise an error for no draft token case when draft_tp>1 #6369

wooyeonlee0 · 2024-07-12T09:14:08Z

This PR adds a simple patch that raises an error so that users no longer hit the hang described in #5814.
The hang happens only when the skip speculation feature is activated, no draft tokens are generated for "all" sequences in a step, and draft_tp > 1.

We may revisit this issue later because it's not resolved completely.
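For illustration, the guard described above can be sketched roughly as follows. This is a minimal standalone sketch, not the actual vLLM code: the helper name `check_draft_tokens`, the exception type, and the message are hypothetical, and it assumes the caller passes `allow_zero_draft_token_step=False` when draft_tp > 1.

```python
from typing import Sequence


def check_draft_tokens(proposal_lens: Sequence[int],
                       allow_zero_draft_token_step: bool) -> None:
    """Fail fast when no sequence in the step has any draft tokens.

    Hypothetical helper: assumes the flag is False when draft_tp > 1,
    which is the configuration where the hang was reported.
    """
    if not allow_zero_draft_token_step and sum(proposal_lens) == 0:
        raise RuntimeError(
            "No draft tokens were generated for any sequence in this step")


# Flag enabled (e.g. draft_tp == 1): a zero-draft-token step passes silently.
check_draft_tokens([0, 0, 0], allow_zero_draft_token_step=True)

# Flag disabled (e.g. draft_tp > 1): the same step raises instead of hanging.
try:
    check_draft_tokens([0, 0, 0], allow_zero_draft_token_step=False)
except RuntimeError as exc:
    print(f"raised: {exc}")
```

A step in which at least one sequence has draft tokens never raises, regardless of the flag.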

wooyeonlee0 · 2024-07-12T09:22:21Z

@cadedaniel @zifeitong @comaniac
This PR implements the second option (raise an error) from cade's suggestion.
Could one of you review this?
I may revisit it later to fix the issue completely, following the first option.

We need to either fix it or raise an error so that users don't hit this.
#5414 (comment)

comaniac

LGTM. Can we enable the test and expect to catch this exception?

wooyeonlee0 · 2024-07-15T07:18:12Z

@comaniac Thanks for the review :)
I've added the test code that catches the error.

But I'm not sure, the CI seems to have stopped.
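A test that expects this error could look roughly like the sketch below. It is an assumption-laden illustration rather than the test added in this PR: `run_step` is a hypothetical stand-in for one speculative decoding step, and the `RuntimeError` type is assumed.

```python
import unittest


def run_step(proposal_lens, allow_zero_draft_token_step):
    """Hypothetical stand-in for one speculative decoding step."""
    if not allow_zero_draft_token_step and sum(proposal_lens) == 0:
        raise RuntimeError("no draft tokens in this step")
    # Return the total number of draft tokens proposed in the step.
    return sum(proposal_lens)


class SkipSpeculationTest(unittest.TestCase):

    def test_zero_draft_tokens_raises(self):
        # With the flag disabled, an all-zero step must raise.
        with self.assertRaises(RuntimeError):
            run_step([0, 0], allow_zero_draft_token_step=False)

    def test_zero_draft_tokens_allowed(self):
        # With the flag enabled, the same step is accepted.
        self.assertEqual(
            run_step([0, 0], allow_zero_draft_token_step=True), 0)
```

Saved to a file, the sketch can be run with `python -m unittest <file>`.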

wooyeonlee0 · 2024-07-15T12:01:36Z

@comaniac The CI test passed :)

comaniac

LGTM

comaniac · 2024-07-15T16:17:47Z

vllm/spec_decode/spec_decode_worker.py

+        if not self.allow_no_draft_tokens and sum(
+                proposals.proposal_lens) == 0:


I'm a bit worried about the overhead that sum brings, but I feel it's fine for now given that it won't be triggered with draft model TP=1. wdyt @cadedaniel?

cadedaniel · 2024-07-15T18:29:13Z

vllm/spec_decode/spec_decode_worker.py

        self.scorer_worker = scorer_worker
        self.disable_by_batch_size = disable_by_batch_size or float("inf")
        self.spec_decode_sampler = spec_decode_sampler
+        self.allow_no_draft_tokens = allow_zero_draft_token_step


nit: can we mark this private, e.g. _allow_no_draft_tokens? we should have done this for all properties but we missed it

cadedaniel · 2024-07-15T18:33:14Z

vllm/spec_decode/spec_decode_worker.py

+        if not self.allow_no_draft_tokens and sum(
+                proposals.proposal_lens) == 0:


yeah, is it possible to store this field in proposals when we create proposals? that way we don't need an additional CPU-GPU-CPU sync
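One way to read this suggestion is sketched below: compute a boolean once when the proposals object is built, so the later check reads a plain field instead of reducing over the lengths again. The class and function names are hypothetical simplifications (the lengths are a plain Python list here, not a tensor); only the idea of a `no_proposals` flag comes from this PR's commit list.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Proposals:
    """Hypothetical, simplified stand-in for a proposals container."""
    proposal_lens: List[int]
    # Set once at creation time; True when no sequence has draft tokens.
    no_proposals: bool = False


def make_proposals(proposal_lens: List[int]) -> Proposals:
    # The lengths are assumed to be available on the CPU here, so this
    # reduction does not need a device synchronization.
    return Proposals(proposal_lens=proposal_lens,
                     no_proposals=not any(proposal_lens))


def check(proposals: Proposals, allow_no_draft_tokens: bool) -> None:
    # Reads the precomputed flag instead of calling sum() again.
    if not allow_no_draft_tokens and proposals.no_proposals:
        raise RuntimeError("no draft tokens in this step")


check(make_proposals([0, 3]), allow_no_draft_tokens=False)  # passes
```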

wooyeonlee0 · 2024-07-18T01:32:46Z

Thanks for the review! I'm gonna handle it right now :)

wooyeonlee0 · 2024-07-19T04:32:33Z

@comaniac
I re-ran the CI multiple times, but each run failed in one of the following steps: 'build image' or 'documentation build'.
Link: https://buildkite.com/vllm/ci-aws/builds/5151#0190c4c4-6cde-4a9f-a587-63f29a3b5dbd
Link: https://buildkite.com/vllm/ci-aws/builds/5218#0190c8b9-3ee9-4548-a691-cbe56abcff24

Is there a problem with the CI right now?

comaniac · 2024-07-19T04:43:51Z

The failure you posted seems random. I'll monitor the current CI run and manually retry failed ones.

wooyeonlee0 · 2024-07-19T08:14:27Z

@cadedaniel @comaniac I've updated the code as you suggested, and I think it passes the test.
Would you take a look? :)

@comaniac CI has finished. Would you retry the 'documentation-build' test? Thank you!

cadedaniel · 2024-07-19T08:21:54Z

@simon-mo can we get a force merge, doc build seems broken

fix it

dadfa82

yapf

ad8390c

wooyeonlee0 force-pushed the temporal-fix-skip-spec branch from b9af049 to ad8390c July 12, 2024 09:29

wooyeonlee0 mentioned this pull request Jul 12, 2024

[Bug]: Test_skip_speculation fails in distributed execution #5814

Closed

wooyeonlee0 added 4 commits July 12, 2024 21:52

fix

a934b12

allow zero token step for other cases

e93781d

update comment

10e6441

yapf

6edf8fc

comaniac reviewed Jul 12, 2024

wooyeonlee0 added 4 commits July 15, 2024 10:10

test_skip_speculation

5398d7a

error on test_skip_spec

a70ccc9

add comment

c4b6f72

yapf

02dc475

wooyeonlee0 force-pushed the temporal-fix-skip-spec branch from 96ec39e to 02dc475 July 15, 2024 02:11

DarkLight1337 added the "ready" label (ONLY add when PR is ready to merge/full CI is needed) Jul 15, 2024

mark. need 4 gpus

c2382b5

comaniac approved these changes Jul 15, 2024

cadedaniel reviewed Jul 15, 2024

no_proposals flag in SpeculativeProposals

fa1f463

wooyeonlee0 force-pushed the temporal-fix-skip-spec branch 5 times, most recently from 2861e99 to 3876858 July 19, 2024 03:54

mypy yapf

4b338d2

wooyeonlee0 force-pushed the temporal-fix-skip-spec branch from 3876858 to 4b338d2 July 19, 2024 04:33

cadedaniel approved these changes Jul 19, 2024

cadedaniel enabled auto-merge (squash) July 19, 2024 08:21

simon-mo merged commit a921e86 into vllm-project:main Jul 19, 2024

wooyeonlee0 deleted the temporal-fix-skip-spec branch July 19, 2024 13:16

xjpang pushed a commit to xjpang/vllm that referenced this pull request Jul 24, 2024

[BUGFIX] Raise an error for no draft token case when draft_tp>1 (vllm-project#6369)

c4ac0f2

Alvant pushed a commit to compressa-ai/vllm that referenced this pull request Oct 26, 2024

[BUGFIX] Raise an error for no draft token case when draft_tp>1 (vllm-project#6369) Signed-off-by: Alvant <[email protected]>

189d288

LeiWang1999 pushed a commit to LeiWang1999/vllm-bitblas that referenced this pull request Mar 26, 2025

[BUGFIX] Raise an error for no draft token case when draft_tp>1 (vllm-project#6369) Signed-off-by: LeiWang1999 <[email protected]>

64e09bf
